On Designing Optimal Parallel Triangular Solvers

Author

  • Eunice E. Santos
Abstract

This paper explores the problem of solving triangular linear systems on parallel distributed-memory machines. Working within the LogP model, it presents tight asymptotic bounds for solving these systems by forward/backward substitution. Specifically, it gives lower bounds on execution time that are independent of the data layout, lower bounds for data layouts in which the number of data items per processor is bounded, and lower bounds for specific data layouts commonly used in parallel algorithms for this problem. It also provides algorithms whose running times are within a constant factor of these lower bounds. One notable result is that the popular two-dimensional block matrix layout necessarily results in significantly longer running times than simpler one-dimensional schemes. Finally, the lower bounds are generalized to banded triangular linear systems. © 2000 Academic Press
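For readers unfamiliar with the substitution method the abstract refers to, the following is a minimal sequential sketch of forward substitution for a lower triangular system. It is not the paper's parallel algorithm; the function name `forward_substitution` and the sample matrix are assumptions chosen only for illustration. It shows the dependency that makes parallelization nontrivial: each unknown depends on all previously computed unknowns.

```python
def forward_substitution(L, b):
    """Solve L x = b for a lower triangular matrix L with nonzero diagonal.

    Illustrative sequential sketch only (hypothetical helper, not from the paper).
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # x[i] needs every earlier unknown x[0..i-1]; this chain of
        # dependencies is what a parallel solver has to work around.
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x


# Small example system (values chosen for illustration).
L = [[2.0, 0.0, 0.0],
     [1.0, 3.0, 0.0],
     [4.0, 5.0, 6.0]]
b = [2.0, 7.0, 32.0]
x = forward_substitution(L, b)
print(x)
```

Backward substitution for an upper triangular system follows the same pattern with the loop running from the last row to the first.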

Similar resources

Parallel Triangular Solvers on GPU

In this paper, we systematically investigate GPU-based parallel triangular solvers. Parallel triangular solvers are fundamental to preconditioners in the incomplete LU factorization family and to algebraic multigrid solvers. We develop a new matrix format suitable for GPU devices. Parallel lower triangular solvers and upper triangular solvers are developed for this new data structure. With these solv...

Generalizing the Implementation of an Optimal Parallel Recursive Algorithm for Triangular Matrix Inversion

This paper describes a generalization of a study on an implementation of a parallel divide-and-conquer algorithm for triangular matrix inversion [3]. Given an original (lower) triangular matrix of size n = m2^k (m, k ≥ 1) and an available number of processors p, a power of 2 (< n), we designed a strongly cost-optimal parallel algorithm, i.e. one whose efficiency (resp. speedup) is equal to 1 (resp. p)...

Development of Krylov and AMG Linear Solvers for Large-Scale Sparse Matrices on GPUs

This research introduces our work on developing Krylov subspace and AMG solvers on NVIDIA GPUs. As SpMV is a crucial part of these iterative methods, SpMV algorithms for a single GPU and for multiple GPUs are implemented. A HEC matrix format and a communication mechanism are established. In addition, a set of specific algorithms for solving preconditioned systems in parallel environments are designed, i...

Dense Triangular Solvers on Multicore Clusters using UPC

The popularity of Partitioned Global Address Space (PGAS) languages has increased in recent years thanks to their high programmability and the performance they achieve through efficient exploitation of data locality. This paper describes the implementation of efficient parallel dense triangular solvers in the PGAS language Unified Parallel C (UPC). The solvers are built on top of sequential BLAS functi...

Parallel Algorithms and Condition Estimators for Standard and Generalized Triangular Sylvester-Type Matrix Equations

We discuss parallel algorithms for solving eight common standard and generalized triangular Sylvester-type matrix equations. Our parallel algorithms are based on explicit blocking, 2D block-cyclic data distribution of the matrices, and wavefront-like traversal of the right-hand side matrices while solving small-sized matrix equations at different nodes and updating the rest of the right-hand side...

Publication date: 2005